Skip to content

feat(payments): retry UX gaps — idempotency-key reuse, attempt counter + cooling, error categorization, offline banner, audit log (#43)#63

Merged
TortoiseWolfe merged 5 commits intomainfrom
043/payment-retry-ux-gaps
Apr 28, 2026
Merged

feat(payments): retry UX gaps — idempotency-key reuse, attempt counter + cooling, error categorization, offline banner, audit log (#43)#63
TortoiseWolfe merged 5 commits intomainfrom
043/payment-retry-ux-gaps

Conversation

@TortoiseWolfe
Copy link
Copy Markdown
Owner

Summary

Closes 5 of 7 gaps in #43. The route itself shipped 2026-04-16 (commit `ffb33a1`); this PR closes the retry-UX gaps that were obscured by the original 'route missing' framing (corrected in #62).

Closes (gaps 1-5 of #43):

  1. Retry button regenerates `idempotency_key`. `retryFailedPayment()` now reuses the parent's queued key, so a doubled click or a click that races a webhook becomes a server-side ON CONFLICT no-op instead of a duplicate charge. Same pattern as the offline-queue adapter that landed in PR fix(offline-queue): watchdog reclaim + idempotent payment INSERTs (#52) #59.
  2. No retry attempt counter or cooling period. `payment_intents` gains `retry_count` + `parent_intent_id`. Service refuses past `RETRY_LIMIT=3` (FR-009); enforces a `COOLING_PERIOD_MS=30s` since parent.created_at (FR-010). New custom error classes carry `waitMs` for UI countdown (FR-008).
  3. No error categorization. New `src/lib/payments/error-categorization.ts` maps Stripe + PayPal `error_code` to one of FR-002's 8 categories with non-technical user message + resolution hint + a `recoverable` flag that gates the retry button (FR-001/003/006/NFR-001).
  4. No offline error banner. New `OfflineRetryBanner` 5-file component reads `useOfflineStatus` + `paymentQueue.getCount()`. Mounted on `/payment-result`; silent in the steady state.
  5. No audit log on retry. `auth_audit_logs` CHECK extended with `payment_retry`. Every retry attempt is recorded with `{ original_intent_id, new_intent_id, retry_count, deduped }` so admin analytics can distinguish a fresh attempt from a server-side dedupe.

Bonus: An expiry guard rejects retry of a parent past its 24h provider session — `PaymentRetryExpiredError`. The same-key retry would have succeeded at the DB but failed at the provider redirect; refusing here gives a clear message instead of a confusing failure.

Deferred to a follow-up PR (gaps 6-7):

  • User Story 3 (Update Payment Method, FR-011-FR-015) — has stripe.js + PCI surface considerations that warrant their own design pass.
  • User Story 4 (Guided Recovery Wizard, FR-016-FR-019) — largest piece by far.

Commit shape

3 logical commits, easy to review independently:

  1. `1bd6949` — schema + types + audit-event enum (~5 files)
  2. `faadd2c` — retry service + audit + categorization libs + tests (~5 files, 750 LOC incl. tests)
  3. `0653839` — UI surface: hook + status display + offline banner + page mount + e2e (~12 files, 928 LOC incl. tests)

Schema changes

Idempotent ALTERs to `supabase/migrations/20251006_complete_monolithic_setup.sql` (no new tables, no migration files):

  • `payment_intents.retry_count INTEGER NOT NULL DEFAULT 0`
  • `payment_intents.parent_intent_id UUID REFERENCES payment_intents(id) ON DELETE SET NULL` + partial index
  • `auth_audit_logs.event_type` CHECK extended with `'payment_retry'` (DROP + ADD on existing DBs, inline on fresh)

Applied to local Supabase via psql as supabase_admin. Cloud application is the reviewer's call — the migration is idempotent so re-applying is safe; the new columns inherit existing RLS policies (no new policies needed). RLS verification: `pnpm test:rls` 55/55 green locally.

Test plan

E2E — #53 progress

Un-skipped `should display offline error banner when offline` in `tests/e2e/payment/03-failed-payment-retry.spec.ts`. Closes one row of #53's skip index. The Stripe-Checkout-required skips remain pending those API keys.

Sharp edges

  • Cooling period is enforced both client-side (button disabled + countdown) and server-side (thrown error). The hook is a UX hint; the service is the security boundary. A user who tampers with client state still hits the same checks.
  • `payment_intents` is INSERT-only. `retry_count` is set on the new (child) row at INSERT time — the parent's count is never mutated. The 'Payment intents are immutable' UPDATE policy enforces this.
  • Upsert with `ignoreDuplicates: true` returns null on conflict. The service surfaces that as 'the original intent is the truth' by returning the parent. No double-charge. This matches the offline-queue adapter pattern and the fix(offline-queue): watchdog reclaim + idempotent payment INSERTs (#52) #59 dedupe contract.

🤖 Generated with Claude Code

TurtleWolfe and others added 5 commits April 28, 2026 00:42
…it event

Adds the database surface that the next two commits build on. Idempotent
ALTERs to the monolithic migration; new columns inherit existing RLS
policies (verified by pnpm test:rls — 55/55 green).

Schema:
- payment_intents.retry_count INTEGER NOT NULL DEFAULT 0
- payment_intents.parent_intent_id UUID REFERENCES payment_intents(id)
  ON DELETE SET NULL, with a partial index for parent-lookup queries
- auth_audit_logs.event_type CHECK extended with 'payment_retry'

Both fresh-DB (CREATE TABLE inline) and existing-DB (ALTER ... DROP +
ADD CONSTRAINT) paths are covered. Applied to local Supabase via psql
as supabase_admin; cloud application deferred to PR review per the
\"local-first sandbox\" memory rule.

Types:
- src/types/payment.ts — PaymentIntent gains retry_count, parent_intent_id,
  idempotency_key (the last was already in the DB from PR #59 but had not
  been mirrored into the domain type)
- src/lib/supabase/types.ts — same three columns added to Row/Insert/Update,
  plus the new self-referential FK Relationship entry
- src/lib/auth/audit-logger.ts — AuditEventType union gains 'payment_retry'

Test fixtures:
- tests/unit/payment-service.test.ts — isPaymentIntentExpired fixture
  needs the three new required fields

Why this commit on its own: schema + type contract is the ground truth
that the service and UI commits build on. Reviewing it standalone makes
the rest of the PR cheaper to read.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…, cooling, expiry guard

Closes the load-bearing gap from #43: the retry button was generating a
fresh idempotency_key on every click, defeating the partial unique index
that landed in PR #59. A doubled click or a click that races a webhook
could produce two intents → two charges. Same-key reuse turns those
races into ON CONFLICT no-ops.

retryFailedPayment() now:
- reuses parent.idempotency_key (FR-006); if absent (legacy intent), gens
  a fresh one and the retry chain still self-dedupes
- bumps retry_count on the new (child) row; throws PaymentRetryLimitError
  past RETRY_LIMIT=3 (FR-009)
- enforces COOLING_PERIOD_MS=30s since parent.created_at; throws
  PaymentRetryCoolingError carrying waitMs so UI can render countdown
  (FR-010)
- guards parent.expires_at; throws PaymentRetryExpiredError if the parent's
  24h provider session has lapsed
- upserts with onConflict:'idempotency_key', ignoreDuplicates:true — a
  doubled click returns null which we surface as \"original is the truth\"
  by returning the parent intent (no double-charge, no UI surprise)
- audit-logs every attempt as 'payment_retry' (NFR-007), including the
  deduped flag so admin analytics can distinguish a fresh retry attempt
  from a server-side dedupe of an already-won race

New libs:
- src/lib/payments/error-categorization.ts — pure mapping from Stripe +
  PayPal error_code (already in payment_results) to one of FR-002's 8
  categories with non-technical user message + resolution hint + a
  recoverable flag that gates the retry button (FR-001/003/006/NFR-001).
  27 tests including 'never leak raw provider text' and 'no technical
  jargon'
- src/lib/payments/audit.ts — logPaymentRetryEvent wrapper around
  logAuthEvent. Non-throwing; an audit-write failure must never break
  payment UX

Tests: 16 new in retry.test.ts (key reuse, cap, cooling, expiry,
dedupe-conflict path, audit emission), 27 in error-categorization.test.ts.
All 92 payment-service-area tests green; full suite 3249/3249.

Why on its own: this is the load-bearing change. UI commit follows and
just consumes the public surface (RETRY_LIMIT, error classes, hook).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ter, cooling button, offline banner

Consumes the service surface from the previous commit to surface the
retry UX gaps that #43 actually closes.

PaymentStatusDisplay (failed-state block rewritten):
- Renders categorized userMessage + resolutionHint instead of raw status
  (FR-001/003). Transaction reference shown for support inquiries (FR-004).
- Recoverable errors show retry button + 'Attempt N of M' counter (FR-008).
  Non-recoverable errors hide retry and show a Contact Support link
  (FR-019 lite — no wizard yet, US4 is a follow-up).
- Cooling state disables the button and shows 'Try again in 23s' that
  ticks down (FR-010). PaymentRetryLimitError / PaymentRetryCoolingError /
  PaymentRetryExpiredError thrown by the service surface client-side as
  retry-error alerts.

usePaymentRetryStatus hook:
- Reads parent intent, computes { retryCount, canRetry, disabledReason,
  coolingMsRemaining }. 1Hz tick while cooling so the countdown is live.
- A UX hint, not a security boundary — server-side checks in
  retryFailedPayment are still the source of truth. The hook just stops
  the user clicking a button that's certain to fail.
- 7 tests using vi.useFakeTimers({ shouldAdvanceTime: true }) so waitFor
  internals work alongside the controlled cooling tick.

OfflineRetryBanner (new 5-file component, peer of PaymentStatusDisplay):
- Reads useOfflineStatus + paymentQueue.getCount(). Renders a warning
  alert when offline (with queued count if any) and an info alert when
  online but queue still draining. Silent in the steady state.
- Mounted above PaymentStatusDisplay on /payment-result for the loaded
  state and above the not-found alert for the still-processing state.
- 7 unit + 1 a11y test, including singular/plural copy and a getCount
  rejection survives without throwing.

E2E (tests/e2e/payment/03-failed-payment-retry.spec.ts):
- Un-skipped 'should display offline error banner when offline' (was
  skipped pending offline UI). Uses context.setOffline(true) to force
  the navigator.onLine bit before navigation, asserts the banner copy.
  Closes one row of #53's skip index.

Verification:
- pnpm test: 3249/3249 across 286 files
- pnpm test:rls: 55/55 (new columns inherit existing policies as expected)
- pnpm run type-check: clean
- pnpm run lint: clean

Out of scope (deferred to a follow-up PR):
- User Story 3 (update payment method, FR-011-FR-015) — stripe.js + PCI
- User Story 4 (guided recovery wizard, FR-016-FR-019) — largest piece

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test as shipped called context.setOffline(true) BEFORE page.goto(),
which works on a dev server (page is in cache after first hit) but fails
in CI's static-export flow with net::ERR_INTERNET_DISCONNECTED — there's
no service-worker cache warmed for this URL when the page first loads.

Fix: navigate online first, wait for the page to render the loaded /
not-found branch (both of which mount OfflineRetryBanner), then flip
offline. useOfflineStatus listens for the browser 'offline' event so the
banner re-renders without another navigation.

Same story as the diagnostic-loud failure modes PR #58 was designed to
catch — the failure was clear from the CI log so this is a one-shot fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…wser (#63)

Previous fix navigated online first, then called context.setOffline(true).
That works in Chromium (chromium-gen 3/6 went green) but fails in Firefox
and WebKit because Playwright's setOffline blocks request traffic without
firing the 'offline' browser event in those engines. useOfflineStatus only
listens for window 'offline' events, so the banner never re-renders.

Real browsers DO fire the event on real network drops, so synthesizing it
in the test is closer to production behavior than relying on Playwright's
imperfect cross-browser setOffline. Override navigator.onLine + dispatch
both 'offline' and (in cleanup) 'online' so subsequent tests don't see
stuck state.

Single-shard verification was the chromium-gen 3/6 success in the prior
attempt (run 25030125601); this commit extends the same path to
firefox/webkit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@TortoiseWolfe TortoiseWolfe merged commit 75f0d69 into main Apr 28, 2026
29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants